Design and implement a deep learning model that learns to recognize traffic signs.
Train and test your model on the German Traffic Sign Dataset.
There are various aspects to consider when thinking about this problem:
Here is an example of a published baseline model on this problem. It's not required to be familiar with the approach used in the paper, but it's good practice to try to read papers like these.
NOTE: The LeNet-5 implementation shown in the classroom at the end of the CNN lesson is a solid starting point. You'll have to change the number of classes and possibly the preprocessing, but aside from that it's plug and play!
Table of Contents
# Hyperparameters
# Arguments used for tf.truncated_normal,
# which randomly defines variables for the weights and biases for each layer
mu = 0 # 0 seems good
sigma = 0.3 # 0.1 also seems good
EPOCHS = 7 # more is better but slower; 6 epochs achieve 98%
BATCH_SIZE = 256 # OK larger is faster, memory limited
# on MacBook Pro 2.3GHz i7, 16GB 1600MHz DDR3 RAM:
# 128 (slowest), 256 (faster), 512 (slower)
DROPOUT = 0.70 # keep probability; 0.75 also works
validation_split = 0.30 # we will use ~30% of TRAIN data for validation
best_model = "./model_0.884893309667"
import tensorflow as tf
from tqdm import tqdm
# tqdm shows a smart progress meter
# usage: tqdm(iterable)
# Load pickled German street signs dataset from:
# http://bit.ly/german_street_signs_dataset
# If file location is not correct you get
# FileNotFoundError: [Errno 2] No such file or directory
training_file = "/Users/ukilucas/dev/DATA/traffic-signs-data/train.p" # 120.7MB
testing_file = "/Users/ukilucas/dev/DATA/traffic-signs-data/test.p" # 38.9 MB
import pickle
with open(training_file, mode='rb') as f:
    train = pickle.load(f)
with open(testing_file, mode='rb') as f:
    test = pickle.load(f)
X_train, y_train = train['features'], train['labels']
X_test, y_test = test['features'], test['labels']
# Make sure the number of images in TRAIN set matches the number of labels
assert(len(X_train) == len(y_train))
# Make sure the number of images in TEST set matches the number of labels
assert(len(X_test) == len(y_test))
# X_validation is not defined yet at this point (it is created later with train_test_split)
# assert(len(X_validation) == len(y_validation))
# print example of one image to see the dimensions of the data
print()
print("Image Shape: {}".format(X_train[0].shape))
print()
# Image Shape: (32, 32, 3) - good for LeNet, no need for padding with zero
# print size of each set
training_set_size = len(X_train)
print("Training Set: {} samples".format(training_set_size))
# Training Set: 39209 samples
testing_set_size = len(X_test)
print("Test Set: {} samples".format(testing_set_size))
# Test Set: 12630 samples
#print("Validation Set: {} samples".format(len(X_validation)))
The pickled data is a dictionary with 4 key/value pairs:
- 'features' is a 4D array containing raw pixel data of the traffic sign images, (num examples, width, height, channels).
- 'labels' is a 1D array containing the label/class id of the traffic sign. The file signnames.csv contains id -> name mappings for each id.
- 'sizes' is a list containing tuples, (width, height), representing the original width and height of the image.
- 'coords' is a list containing tuples, (x1, y1, x2, y2), representing coordinates of a bounding box around the sign in the image. THESE COORDINATES ASSUME THE ORIGINAL IMAGE. THE PICKLED DATA CONTAINS RESIZED VERSIONS (32 by 32) OF THESE IMAGES.

Complete the basic data summary below.
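As a quick illustration of this layout, here is a tiny hand-made dictionary with the same four keys. The values are fabricated placeholders (two zero images, made-up sizes and boxes), not real dataset entries:

```python
import numpy as np

# Hypothetical two-example batch mimicking the pickled dictionary layout
data = {
    'features': np.zeros((2, 32, 32, 3), dtype=np.uint8),  # (num examples, width, height, channels)
    'labels': np.array([14, 13]),                          # class ids, see signnames.csv (14 = Stop, 13 = Yield)
    'sizes': [(63, 62), (48, 47)],                         # original (width, height) before resizing
    'coords': [(5, 5, 58, 57), (4, 4, 44, 43)],            # bounding boxes in ORIGINAL image coordinates
}

X, y = data['features'], data['labels']
print(X.shape)  # (2, 32, 32, 3)
print(y.shape)  # (2,)
```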
MEETS SPECIFICATIONS: Student performs basic data summary.
import numpy as np
### Replace each question mark with the appropriate value.
# TODO: Number of training examples
# Training Set: 39209 samples
n_train = X_train.shape[0]
# TODO: Number of testing examples.
# Test Set: 12630 samples
n_test = X_test.shape[0]
# TODO: What's the shape of a traffic sign image?
# Image Shape: (32, 32, 3)
image_shape = X_test.shape[1:3]
input_image_size = 32
number_of_channels = 3 # trying to keep color
# TODO: How many unique classes/labels there are in the dataset.
# see signnames.csv 43 elements (0 to 42)
number_train_labels = np.unique(y_train).shape[0]
print("Number of training examples =", n_train)
print("Number of testing examples =", n_test)
print("Image data shape =", image_shape)
print("Number of classes (training labels) =", number_train_labels)
np.unique(y_train)
# Output:
# Number of training examples = 39209
# Number of testing examples = 12630
# Image data shape = (32, 32)
# Number of classes = 43
MEETS SPECIFICATIONS: Student performs an exploratory visualization on the dataset.
Overview:
Visualize the German Traffic Signs Dataset using the pickled file(s). This is open ended, suggestions include: plotting traffic sign images, plotting the count of each sign, etc.
The Matplotlib examples and gallery pages are a great resource for doing visualizations in Python.
NOTE: It's recommended you start with something simple first. If you wish to do more, come back to it after you've completed the rest of the sections.
def human_readable_sign_names(sign_number):
    return {
        0: "Speed limit (20km/h)",
        1: "Speed limit (30km/h)",
        2: "Speed limit (50km/h)",
        3: "Speed limit (60km/h)",
        4: "Speed limit (70km/h)",
        5: "Speed limit (80km/h)",
        6: "End of speed limit (80km/h)",
        7: "Speed limit (100km/h)",
        8: "Speed limit (120km/h)",
        9: "No passing",
        10: "No passing for vehicles over 3.5 metric tons",
        11: "Right-of-way at the next intersection",
        12: "Priority road",
        13: "Yield",
        14: "Stop",
        15: "No vehicles",
        16: "Vehicles over 3.5 metric tons prohibited",
        17: "No entry",
        18: "General caution",
        19: "Dangerous curve to the left",
        20: "Dangerous curve to the right",
        21: "Double curve",
        22: "Bumpy road",
        23: "Slippery road",
        24: "Road narrows on the right",
        25: "Road work",
        26: "Traffic signals",
        27: "Pedestrians",
        28: "Children crossing",
        29: "Bicycles crossing",
        30: "Beware of ice/snow",
        31: "Wild animals crossing",
        32: "End of all speed and passing limits",
        33: "Turn right ahead",
        34: "Turn left ahead",
        35: "Ahead only",
        36: "Go straight or right",
        37: "Go straight or left",
        38: "Keep right",
        39: "Keep left",
        40: "Roundabout mandatory",
        41: "End of no passing",
        42: "End of no passing by vehicles over 3.5 metric tons"
    }.get(sign_number, "Error: sign not found")  # default if sign_number not found
# TEST function
print( human_readable_sign_names(0))
print( human_readable_sign_names(28))
print( human_readable_sign_names(42))
print( human_readable_sign_names(43))
Please note the terrible quality of the images.
In this case, the computer might be able to detect the right sign better than the human eye.
import random
# import numpy as np # already imported
import matplotlib.pyplot as plt
%matplotlib inline
figure, axes = plt.subplots(10, 5, figsize=(32, 32))
for rows in range(10):
    for columns in range(5):
        index = random.randint(0, len(X_train) - 1)  # randint is inclusive on both ends
        image = X_train[index]
        axes[rows, columns].imshow(image)
        axes[rows, columns].set_title(human_readable_sign_names(y_train[index]))
plt.show()
# train is the pickle file
histOut = plt.hist(train['labels'],number_train_labels, facecolor='g', alpha=0.60)
histOut = plt.hist(test['labels'],number_train_labels, facecolor='g', alpha=0.60)
Student provides sufficient details of the preprocessing techniques used. Additionally, the student discusses why the techniques were chosen.
Describe how you preprocessed the data. Why did you choose that technique?
Answer:
def scale_image_color_depth(value):
    """
    Normalizes image color depth values 0..255
    to values between -0.5 and +0.5.
    """
    # image color depth has values 0 to 255
    max_value = 255.0
    # subtract half the range (127.5), then divide by the range
    return (value - max_value / 2) / max_value
# TEST:
print("normalized", scale_image_color_depth(0)) # min value
print("normalized", scale_image_color_depth(128)) # half value
print("normalized", scale_image_color_depth(255)) # max value
def scale_image_color_depth_for_all(image_set):
    results = np.copy(image_set)  # create placeholder
    for i in tqdm(range(image_set.shape[0])):
        results[i] = scale_image_color_depth(image_set[i].astype(float))
    return results
The effect should be that images with little contrast become much more readable.
# http://docs.opencv.org/3.1.0/d5/daf/tutorial_py_histogram_equalization.html
import cv2
import numpy as np
from matplotlib import pyplot as plt
def histogram_equalization(image):
    hist, bins = np.histogram(image.flatten(), 256, [0, 256])
    cdf = hist.cumsum()
    cdf_normalized = cdf * hist.max() / cdf.max()
    plt.plot(cdf_normalized, color='b')
    plt.hist(image.flatten(), 256, [0, 256], color='r')
    plt.xlim([0, 256])
    plt.legend(('cdf', 'histogram'), loc='upper left')
    plt.show()
    return image
def histogram_clahe(image):
    # create a CLAHE object (arguments are optional)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    cl1 = clahe.apply(image)
    # cv2.imwrite('clahe_2.jpg', cl1)
    return cl1
image = cv2.imread('images/sample_set.jpg',0)
plt.imshow(image)
plt.show()
# histogram_equalization(image)
new_image = histogram_clahe(image)
plt.imshow(new_image)
plt.show()
# TRAIN FEATURES
# Apply histogram before changing color depth
# train_features = histogram_clahe(X_train.astype(float))
# Scale training set
train_features = scale_image_color_depth_for_all(X_train.astype(float))
# TEST FEATURES
# Apply histogram before changing color depth
# test_features = histogram_clahe(X_test.astype(float))
# Scale testing set
test_features = scale_image_color_depth_for_all(X_test.astype(float))
# TODO the application of histogram needs more love.
Use scaled values of the training set.
Student describes how the model was trained and evaluated. If the student generated additional data they discuss their process and reasoning. Additionally, the student discusses the difference between the new dataset with additional data, and the original dataset.
Describe how you set up the training, validation and testing data for your model. Optional: If you generated additional data, how did you generate the data? Why did you generate the data? What are the differences in the new dataset (with generated data) from the original dataset?
Answer:
from sklearn.model_selection import train_test_split
# (sklearn.cross_validation is deprecated in newer scikit-learn versions)
# http://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html
X_train, X_validation, y_train, y_validation = train_test_split(
    train_features, y_train,
    test_size=validation_split,
    random_state=42)
# Make sure the number of images in TRAIN set matches the number of labels
assert(len(X_train) == len(y_train))
print("len(X_train)", len(X_train))
# Make sure the number of images in TEST set matches the number of labels
assert(len(X_test) == len(y_test))
print("len(X_test)", len(X_test))
# Make sure the number of images in VALIDATION set matches the number of labels
assert(len(X_validation) == len(y_validation))
print("len(X_validation)", len(X_validation))
from sklearn.utils import shuffle
X_train, y_train = shuffle(X_train, y_train)
x is a placeholder for a batch of input images. y is a placeholder for a batch of output labels.
# x variable (Tensor) stores input batches
# None - later accepts a batch of any size
# image dimensions 32x32x3
x = tf.placeholder(tf.float32, (None,
                                input_image_size,
                                input_image_size,
                                number_of_channels))  # (None, 32, 32, 3)
# y variable (Tensor) stores labels
y = tf.placeholder(tf.int32, (None))  # if using "None,number_of_channels" -> (None, 3)
# encode our labels
one_hot_y = tf.one_hot(y, number_train_labels) # 43
# See definition of the DROPOUT below
keep_prob = tf.placeholder(tf.float32)
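As a sanity check on what `tf.one_hot` produces for the labels, here is a minimal NumPy equivalent (for illustration only; the pipeline itself uses the TensorFlow op):

```python
import numpy as np

def one_hot(labels, n_classes):
    """NumPy sketch of tf.one_hot for integer class labels."""
    encoded = np.zeros((len(labels), n_classes))
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded

print(one_hot([0, 2, 1], 4))
# [[1. 0. 0. 0.]
#  [0. 0. 1. 0.]
#  [0. 1. 0. 0.]]
```

Each row has exactly one 1.0, at the column matching the class id; with 43 sign classes, `one_hot_y` rows are length-43 vectors.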
Student provides sufficient details of the characteristics and qualities of the architecture, such as the type of model used, the number of layers, the size of each layer. Visualizations emphasizing particular qualities of the architecture are encouraged.
Implement the neural network architecture based on LeNet-5.
Input
The LeNet architecture accepts a 32x32xC image, where C is the number of color channels.
What does your final architecture look like? (Type of model, layers, sizes, connectivity, etc.) For reference on how to build a deep neural network using TensorFlow, see Deep Neural Network in TensorFlow from the classroom.
Answer:
CONVOLUTIONAL LAYER L1
CONVOLUTIONAL LAYER L2
FLATTENING LAYER L3
FULLY CONNECTED LAYER L4
FULLY CONNECTED LAYER L5
FULLY CONNECTED LAYER L6
def convolution_output_size(input_size=32, filter_size=5, stride_veritcal=1):
    output_size = (input_size - filter_size + 1) / stride_veritcal
    print("Calculated output size", output_size)
    return output_size
def pooling_layer(input_tensor):
    # POOLING (SUBSAMPLING) LAYER
    # Input = 28x28x6.
    # Output = 14x14x6.
    # value: A 4-D Tensor with shape [batch, height, width, channels] and type tf.float32.
    # ksize: A list of ints that has length >= 4. The size of the window for each dimension of the input tensor.
    # strides: A list of ints that has length >= 4. The stride of the sliding window for each dimension of the input tensor.
    # padding: A string, either 'VALID' or 'SAME'.
    # name: Optional name for the operation.
    tensor = tf.nn.max_pool(value=input_tensor,
                            ksize=[1, 2, 2, 1],
                            strides=[1, 2, 2, 1],
                            padding='VALID')
    print("Pooling Layer Output", tensor)
    # L2 output Tensor("L2:0", shape=(?, 14, 14, 6), dtype=float32)
    return tensor
def convolution_layer(input_tensor, filter_size=5, input_depth=3, output_depth=6):
    # L1 filter (5,5,3,6)
    # L2 filter (5,5,6,16)
    filter_tensor = tf.Variable(tf.truncated_normal(
        shape=(filter_size,
               filter_size,
               input_depth,
               output_depth),
        mean=mu,
        stddev=sigma))
    bias = tf.Variable(tf.zeros(output_depth))
    tensor = tf.nn.conv2d(input=input_tensor,
                          filter=filter_tensor,
                          strides=[1, 1, 1, 1],
                          padding='VALID') + bias
    # NOTE: input_size is hardcoded to 32, so this printout is only correct for L1
    convolution_output_size(input_size=32, filter_size=filter_size, stride_veritcal=1)
    # calculated output size 28.0
    print("Convolution Output", tensor)
    # L1 output Tensor("add:0", shape=(?, 28, 28, 6), dtype=float32)
    # ReLU Activation function.
    tensor = tf.nn.relu(features=tensor)
    print("ReLU Activation function Output", tensor)
    # ReLU output Tensor("Relu:0", shape=(?, 28, 28, 6), dtype=float32)
    tensor = pooling_layer(input_tensor=tensor)
    return tensor
def convolution_fully_connected(input_tensor, input_size=400, output_size=120):
    # Despite the name, this is a plain fully connected (dense) layer.
    # Fully Connected. Input = 400. Output = 120.
    filter_tensor = tf.Variable(tf.truncated_normal(
        shape=(input_size, output_size),
        mean=mu, stddev=sigma))
    bias = tf.Variable(tf.zeros(output_size))
    tensor = tf.matmul(input_tensor, filter_tensor) + bias
    print("Fully Connected Output", tensor)
    # ReLU Activation.
    tensor = tf.nn.relu(tensor)
    print("ReLU Activation function Output", tensor)
    return tensor
from tensorflow.contrib.layers import flatten
def convolutional_neural_network(tensor):
    print("CONVOLUTIONAL LAYER L1")
    tensor = convolution_layer(
        input_tensor=tensor, filter_size=5, input_depth=3, output_depth=6)
    print("CONVOLUTIONAL LAYER L2")
    tensor = convolution_layer(
        input_tensor=tensor, filter_size=5, input_depth=6, output_depth=16)
    # Input Tensor("MaxPool_1:0", shape=(?, 5, 5, 16), dtype=float32)
    print("FLATTENING LAYER L3")
    # Flattens Input 5x5x16 = 400
    tensor = flatten(tensor)
    print("Flattened Output", tensor)
    # Tensor("Flatten/Reshape:0", shape=(?, 400), dtype=float32)
    print("FULLY CONNECTED LAYER L4")
    tensor = convolution_fully_connected(input_tensor=tensor, input_size=400, output_size=120)
    # Fully connected output tensor Tensor("add_2:0", shape=(?, 120), dtype=float32)
    # ReLU output tensor Tensor("Relu_2:0", shape=(?, 120), dtype=float32)
    print("FULLY CONNECTED LAYER L5")
    tensor = convolution_fully_connected(input_tensor=tensor, input_size=120, output_size=84)
    # Fully connected output tensor Tensor("add_3:0", shape=(?, 84), dtype=float32)
    # ReLU output tensor Tensor("Relu_3:0", shape=(?, 84), dtype=float32)
    print("FULLY CONNECTED LAYER L6")
    # NOTE: convolution_fully_connected applies ReLU, so the logits also pass
    # through a final ReLU; the classic LeNet-5 leaves the last layer linear.
    tensor = convolution_fully_connected(input_tensor=tensor, input_size=84, output_size=43)
    # Fully connected output tensor Tensor("add_4:0", shape=(?, 43), dtype=float32)
    # ReLU output tensor Tensor("Relu_4:0", shape=(?, 43), dtype=float32)
    return tensor  # logits
Student describes how the model was trained and evaluated. If the student generated additional data they discuss their process and reasoning. Additionally, the student discusses the difference between the new dataset with additional data, and the original dataset.
Answer:
Create a training pipeline that uses the model to classify sign data.
logits = convolutional_neural_network(x)
Evaluate the loss and accuracy of the model for a given dataset.
Cross entropy is a measure of how different the logits (output class scores) are from the ground-truth training labels.
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=one_hot_y)
print(cross_entropy)
# average entropy from all the training images
mean_loss_tensor = tf.reduce_mean(cross_entropy)
print("mean_loss_tensor",mean_loss_tensor)
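The math behind this loss can be sketched in plain NumPy (an illustration only, not TensorFlow's fused implementation): softmax turns logits into probabilities, and cross entropy penalizes low probability on the true class.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))  # shift by max for numerical stability
    return e / e.sum()

def cross_entropy(logits, one_hot_label):
    # -sum(label * log(probability)); only the true class contributes
    return -np.sum(one_hot_label * np.log(softmax(logits)))

# Confident, CORRECT prediction -> low loss
print(cross_entropy(np.array([5.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0])))
# Confident, WRONG prediction -> high loss
print(cross_entropy(np.array([5.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])))
```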
_How did you train your model? (Type of optimizer, batch size, epochs, hyperparameters, etc.)_
Answer:
This minimizes the loss function (similar to what Stochastic Gradient Descent (SGD) does).
AdamOptimizer is more sophisticated, so it is a good default.
It uses moving averages of the parameters (momentum); Bengio discusses the reasons why this is beneficial in Section 3.1.1 of this paper.
Simply put, this enables Adam to use a larger effective step size (learning rate), and the algorithm will converge to that step size without fine tuning.
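The update rule Adam applies can be sketched for a single scalar parameter as follows. This is illustrative NumPy only; `tf.train.AdamOptimizer` implements this internally, and the coefficients shown (beta1=0.9, beta2=0.999, eps=1e-8) are the usual published defaults:

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter."""
    m = beta1 * m + (1 - beta1) * grad       # moving average of the gradient (momentum)
    v = beta2 * v + (1 - beta2) * grad ** 2  # moving average of the squared gradient
    m_hat = m / (1 - beta1 ** t)             # bias correction for zero-initialized averages
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Minimize f(w) = w**2 (gradient 2w), starting from w = 1.0
w, m, v = 1.0, 0.0, 0.0
for t in range(1, 2001):
    w, m, v = adam_step(w, 2.0 * w, m, v, t)
print("final w:", w)  # converges toward the minimum at w = 0
```

Note how the effective step size is roughly `lr` regardless of the gradient's raw magnitude, which is the "larger effective step size without fine tuning" behavior mentioned above.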
def print_hyper_parameters():
    print("- mu", mu)
    print("- sigma", sigma)
    print("- EPOCHS", EPOCHS, "more is better, but reaching > 98% is sufficient")
    print("- BATCH SIZE", BATCH_SIZE, "best results with 256 on my computer")
    print("- DROPOUT", DROPOUT, "used for keep_prob")
print_hyper_parameters()
To reduce overfitting, we will apply dropout before the readout layer. We create a placeholder for the probability that a neuron's output is kept during dropout. This allows us to turn dropout on during training, and turn it off during testing. TensorFlow's tf.nn.dropout op automatically handles scaling neuron outputs in addition to masking them, so dropout just works without any additional scaling.
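The masking-plus-scaling behavior described above can be mimicked in a few lines of NumPy (an "inverted dropout" sketch, not the actual `tf.nn.dropout` implementation):

```python
import numpy as np

np.random.seed(0)

def dropout(activations, keep_prob):
    """Mimic tf.nn.dropout: keep each unit with probability keep_prob,
    and scale the survivors by 1/keep_prob so the expected value is unchanged."""
    mask = np.random.rand(*activations.shape) < keep_prob
    return activations * mask / keep_prob

a = np.ones((1000,))
out = dropout(a, keep_prob=0.7)
print(out.mean())  # close to 1.0 - the 1/keep_prob scaling preserves the expected value
```

Because of that rescaling, no compensation is needed at test time; feeding `keep_prob = 1.0` simply disables the masking.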
# learning rate (how quickly to update the networks weights)
rate = 0.001
adam_optimizer = tf.train.AdamOptimizer(learning_rate = rate)
print("adam_optimizer", adam_optimizer)
# uses backpropagation
adam_optimizer_minimize = adam_optimizer.minimize(mean_loss_tensor)
print("adam_optimizer_minimize", adam_optimizer_minimize)
# is the prediction correct?
are_preditions_correct = tf.equal(tf.argmax(logits, 1), tf.argmax(one_hot_y, 1))
print("are_preditions_correct", are_preditions_correct)
# calc the model's overall accuracy by averaging individual prediction accuracies
predition_mean = tf.reduce_mean(tf.cast(are_preditions_correct, tf.float32))
print("predition_mean", predition_mean)
def evaluate(X_data, y_data):
    num_examples = len(X_data)
    total_accuracy = 0
    sess = tf.get_default_session()
    for offset in tqdm(range(0, num_examples, BATCH_SIZE)):
        batch_x, batch_y = X_data[offset:offset+BATCH_SIZE], y_data[offset:offset+BATCH_SIZE]
        accuracy = sess.run(predition_mean, feed_dict={x: batch_x, y: batch_y})
        total_accuracy += (accuracy * len(batch_x))
    return total_accuracy / num_examples
import time
start = time.time()
saver = tf.train.Saver()
vector_accurancies = []
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print("Training...")
    print_hyper_parameters()
    print()
    for i in range(EPOCHS):
        # shuffle: make sure training is not biased by the order of images
        X_train, y_train = shuffle(X_train, y_train)
        # break the training set into batches,
        # train the model on each batch
        for offset in range(0, training_set_size, BATCH_SIZE):
            end = offset + BATCH_SIZE
            # print("running batch from", offset, " step ", end, " up to ", training_set_size)
            batch_x = X_train[offset:end]
            batch_y = y_train[offset:end]
            sess.run(adam_optimizer_minimize,
                     feed_dict={x: batch_x, y: batch_y, keep_prob: DROPOUT})
        # at the end of each epoch, evaluate against the validation set
        validation_accuracy = evaluate(X_validation, y_validation)
        vector_accurancies.extend([validation_accuracy])
        print("EPOCH {} ...".format(i+1),
              "Validation Accuracy = {:.3f}".format(validation_accuracy))
        # EPOCH 1 ... Validation Accuracy = 0.300 - very low
        # EPOCH 2 ... Validation Accuracy = 0.500 - growing
        # training for 2 epochs and 256 batch size took 41.1 seconds
        print()
    end = time.time()
    print('Training for {} epochs and {} batch size took {} seconds'.format(
        EPOCHS, BATCH_SIZE, round(end - start, 1)))
    # upon training completion, save the model so we do not have to train again
    # be sure to save the model with the achieved accuracy in the name,
    # this way I can later select the best run
    saver.save(sess, './model_' + str(validation_accuracy))
    print("Model saved")
import matplotlib.pyplot as plt
plt.plot(vector_accurancies)
plt.xlabel('EPOCHS')
plt.ylabel('PERCENTAGE')
plt.show()
# Example Run 100 epochs
vector_accurancies = [ 0.210, 0.357, 0.419, 0.583, 0.625, 0.633, 0.639,
0.724, 0.694, 0.705, 0.734, 0.676, 0.747, 0.770,
0.749, 0.728, 0.803, 0.807, 0.786, 0.760, 0.784,
0.833, 0.826, 0.826, 0.838, 0.835, 0.789, 0.764,
0.825, 0.841, 0.847, 0.849, 0.853, 0.851, 0.842,
0.807, 0.832, 0.760, 0.821, 0.847, 0.849, 0.858,
0.848, 0.854, 0.857, 0.863, 0.860, 0.861, 0.862,
0.862, 0.849, 0.714, 0.852, 0.859, 0.841, 0.856,
0.862, 0.861, 0.864, 0.855, 0.837, 0.847, 0.763,
0.870, 0.875, 0.873, 0.866, 0.880, 0.877, 0.883,
0.882, 0.886, 0.885, 0.886, 0.884, 0.886, 0.885,
0.886, 0.887, 0.886, 0.886, 0.886, 0.886, 0.887,
0.886, 0.887, 0.887, 0.887, 0.365, 0.827, 0.867,
0.865, 0.875, 0.882, 0.885, 0.885, 0.885, 0.885,
0.886, 0.885]
import matplotlib.pyplot as plt
plt.plot(vector_accurancies)
plt.xlabel('EPOCHS')
plt.ylabel('PERCENTAGE')
plt.show()
_What approach did you take in coming up with a solution to this problem? It may have been a process of trial and error, in which case, outline the steps you took to get to the final solution and why you chose those steps. Perhaps your solution involved an already well known implementation or architecture. In this case, discuss why you think this is suitable for the current problem._
Answer:
Training...
100%|██████████| 16/16 [00:03<00:00, 4.91it/s] EPOCH 1 ... Validation Accuracy = 0.181
100%|██████████| 16/16 [00:01<00:00, 8.14it/s] EPOCH 97 ... Validation Accuracy = 0.803
100%|██████████| 16/16 [00:02<00:00, 7.70it/s] EPOCH 98 ... Validation Accuracy = 0.802
100%|██████████| 16/16 [00:01<00:00, 8.09it/s] EPOCH 99 ... Validation Accuracy = 0.803
100%|██████████| 16/16 [00:02<00:00, 7.89it/s] EPOCH 100 ... Validation Accuracy = 0.803
Training for 100 epochs and 512 batch size took 3649.5 seconds Model saved
# I will be updating the model name to the HIGHEST accuracy achieved
with tf.Session() as sess:
    print('Re-loading saved model ' + best_model)
    saver.restore(sess, best_model)
    test_accuracy = evaluate(X_test, y_test)
    print("Evaluating the TEST set against the restored model trained to 88.4%, result = {:.1f}% accuracy".format(test_accuracy * 100))
Take several pictures of traffic signs that you find on the web or around you (at least five), and run them through your classifier on your computer to produce example results. The classifier might not recognize some local signs but it could prove interesting nonetheless.
You may find signnames.csv useful as it contains mappings from the class id (integer) to the actual sign name.
Choose five candidate images of traffic signs and provide them in the report. Are there any particular qualities of the image(s) that might make classification difficult? It could be helpful to plot the images in the notebook.
Answer
I do not understand why we use such a low-quality training set of 32x32 images; that resolution makes sense for character recognition, but not for signs with important text inside.
directory = "images/verification"
prepended_by = "slip_"
import os
listing = os.listdir(directory)
print (len(listing))
listing[5]
from skimage import io
import numpy as np
from matplotlib import pyplot as plt
# count and display valid images
counter = 0
for i in range(len(listing)):
    if ".jpg" not in listing[i]:
        print("ignoring", listing[i])
        continue
    if prepended_by not in listing[i]:
        print("ignoring", listing[i])
        continue
    image = io.imread(directory + "/" + listing[i])
    plt.figure(figsize=(2, 2))
    plt.imshow(image)
    plt.show()
    if "32x32" in listing[i]:
        counter = counter + 1
print("counter", counter)
image_matrix = np.uint8(np.zeros((counter, 32, 32, 3)))
print("image_matrix", image_matrix.shape)
index = -1
for i in range(len(listing)):
    if ".jpg" not in listing[i]:
        print("ignoring", listing[i])
        continue
    if prepended_by not in listing[i]:
        print("ignoring", listing[i])
        continue
    if "32x32" in listing[i]:
        image = io.imread(directory + "/" + listing[i])
        index = index + 1
        image_matrix[index] = image
        print("adding", listing[i], "@ index", index)
image_matrix.shape
Student documents the performance of the model when tested on the captured images and compares it to the results of testing on the dataset.
Is your model able to perform equally well on captured pictures when compared to testing on the dataset? The simplest way to do this is to check the accuracy of the predictions. For example, if the model predicted 1 out of 5 signs correctly, it's 20% accurate.
NOTE: You could check the accuracy manually by using signnames.csv (same directory). This file has a mapping from the class id (0-42) to the corresponding sign name. So, you could take the class id the model outputs, lookup the name in signnames.csv and see if it matches the sign from the image.
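The accuracy arithmetic described above can be sketched with a handful of hypothetical class ids (the values below are made up for illustration, not actual model output):

```python
# Hypothetical predicted vs. true class ids for five captured images
predicted = [23, 25, 23, 14, 23]  # model output class ids
actual    = [23, 23, 23, 23, 23]  # all five images are "Slippery road" (id 23)

accuracy = sum(p == a for p, a in zip(predicted, actual)) / len(actual)
print("{:.0f}% accurate".format(accuracy * 100))  # 3 of 5 correct -> 60%
```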
Answer:
items = len(image_matrix)
print("Number of new images:", items)
new_images = scale_image_color_depth_for_all(
    image_matrix.reshape((-1, 32, 32, 3)).astype(float))
new_labels = [23] * items
print(new_labels)
human_readable_sign_names(23)
# I will be updating the model name to the HIGHEST accuracy achieved
with tf.Session() as sess:
    print('Re-loading saved model ' + best_model)
    saver.restore(sess, best_model)
    test_accuracy = evaluate(new_images, new_labels)
    print("Evaluating the NEW images against the restored model trained to 88.4%, result = {:.1f}% accuracy".format(test_accuracy * 100))
The softmax probabilities of the predictions on the captured images are visualized. The student discusses how certain or uncertain the model is of its predictions.
Answer:
softmax_tensor = tf.nn.softmax(logits)
def classify_images(X_data):
    session = tf.get_default_session()
    # keep_prob = 1.0: dropout should be disabled at inference time
    predicted_tensor = session.run(softmax_tensor, feed_dict={x: X_data, keep_prob: 1.0})
    return predicted_tensor
with tf.Session() as sess:
    print('Re-loading saved model ' + best_model)
    saver.restore(sess, best_model)
    predictions = classify_images(new_images)
    top_k_tensor = sess.run(tf.nn.top_k(predictions, 5, sorted=True))
label_indexes = np.argmax(top_k_tensor, 1)
values = label_indexes[1, 1:]
for index in tqdm(range(len(values))):
    print(human_readable_sign_names(values[index]), values[index])
### Visualize the softmax probabilities
top = 5
for i in range(top):
    predictions = top_k_tensor[0][i]
    plt.figure(i)  # create the figure before setting its title
    plt.title('Top {} Softmax probabilities for option {}'.format(top, str(i)))
    plt.xlabel('label #')
    plt.ylabel('prediction')
    plt.bar(range(top), predictions, 0.10, color='b')
    plt.xticks(np.arange(top) + 0.10, tuple(predictions))
    plt.show()
Note: Once you have completed all of the code implementations and successfully answered each question above, you may finalize your work by exporting the iPython Notebook as an HTML document. You can do this by using the menu above and navigating to File -> Download as -> HTML (.html). Include the finished document along with this notebook as your submission.